Quantifying the Relationship between Hit Count Estimates and Wikipedia Article Traffic

Authors

  • Tina Tian
  • Ankur Agrawal
Abstract

This paper analyzes the relationship between search engine hit counts and Wikipedia article views by evaluating the cross correlation between them. We observe the hit count estimates of three popular search engines over a month and compare them with the Wikipedia page views. The strongest cross correlations are recorded together with their delays in days. We present the results for the different search engines as both graphs and quantitative data. We also investigate predictive trends between the hit counts and Wikipedia article traffic.

Keywords: hit count estimations; search engines; Wikipedia article traffic; cross correlation; positive delay; negative delay; prediction of Web hosting trend
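
As a rough illustration of the kind of analysis the abstract describes, the sketch below computes a Pearson correlation between two daily series at several day offsets and reports the offset with the strongest correlation. It is not the authors' code. The function names, the sample numbers, and the sign convention (a positive lag means the page views trail the hit counts) are all assumptions made for this example, and the paper may define positive and negative delay differently.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def lagged_correlation(hits, views, lag):
    """Correlate hits[t] with views[t + lag]; both series must have equal length.

    Assumed convention: lag > 0 means views follow hits by `lag` days.
    """
    if lag >= 0:
        a, b = hits[:len(hits) - lag], views[lag:]
    else:
        a, b = hits[-lag:], views[:len(views) + lag]
    return pearson(a, b)


def strongest_lag(hits, views, max_lag):
    """Return (lag, correlation) with the largest absolute correlation in [-max_lag, max_lag]."""
    best = max(
        range(-max_lag, max_lag + 1),
        key=lambda k: abs(lagged_correlation(hits, views, k)),
    )
    return best, lagged_correlation(hits, views, best)


# Hypothetical daily data: views repeat the hit-count pattern two days later.
hits = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12]
views = [5, 9] + hits[:8]
lag, r = strongest_lag(hits, views, max_lag=3)
print(f"strongest correlation {r:.3f} at a delay of {lag} day(s)")
```

Scanning both negative and positive lags is what allows either series to lead the other, which is presumably why the keywords distinguish positive from negative delay.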

Similar resources

HIT Approaches to Entity Linking at TAC 2011

This paper describes the system of HIT at the 2011 Text Analysis Conference (TAC) Knowledge Base Population (KBP) track English Entity Linking task. Based on structured and unstructured information extracted from Wikipedia, this system predicts the most probable entity that a query mention might refer to. A similarity score is assigned to the candidate entity by computing the relatedness be...

Web citations in patents: Evidence of technological impact?

Patents sometimes cite webpages either as general background to the problem being addressed or to identify prior publications that limit the scope of the patent granted. Counts of the number of patents citing an organization’s website may therefore provide an indicator of its technological capacity or relevance. This article introduces methods to extract URL citations from patents and evaluates...

Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language

BACKGROUND Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public. OBJECTIVE This paper quantifies the production and consumption of Wikipedia's medical content along 4 dimensions. First, we measured the amount of medical content in both...

The Substantial Interdependence of Wikipedia and Google: A Case Study on the Relationship Between Peer Production Communities and Information Technologies

While Wikipedia is a subject of great interest in the computing literature, very little work has considered Wikipedia’s important relationships with other information technologies like search engines. In this paper, we report the results of two deception studies whose goal was to better understand the critical relationship between Wikipedia and Google. These studies silently removed Wikipedia c...

The performance of a LRU cache under dynamic catalog traffic

We propose a simple traffic model featuring a dynamic catalog to construct a theoretical estimation of the hit ratio for a LRU cache offered such a traffic regime. We validate the accuracy of our theoretical estimates by computing the empirical hit ratio for real request sequences coming from traces of the Orange network.

Publication date: 2015